Overview

Dataset statistics

Number of variables: 29
Number of observations: 106
Missing cells: 100
Missing cells (%): 3.3%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 24.1 KiB
Average record size in memory: 233.2 B
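
The missing-cell percentage follows from the counts above: 100 missing cells out of 29 x 106 = 3074 cells. A minimal sketch of that arithmetic (the variable names are illustrative and not part of the report):

```python
# Figures taken from the "Dataset statistics" block of this report.
n_variables = 29
n_observations = 106
missing_cells = 100

total_cells = n_variables * n_observations        # every cell in the table
missing_pct = 100 * missing_cells / total_cells   # share of cells with no value

print(f"{missing_pct:.1f}%")  # 3.3%
```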

Variable types

Text: 10
DateTime: 2
Categorical: 11
Numeric: 6

Alerts

suffix has constant value "Unknown" [Constant]
state has constant value "Massachusetts" [Constant]
county is highly overall correlated with fips and 3 other fields [High correlation]
fips is highly overall correlated with county and 2 other fields [High correlation]
gender is highly overall correlated with prefix [High correlation]
healthcare_coverage is highly overall correlated with maiden [High correlation]
healthcare_expenses is highly overall correlated with maiden [High correlation]
income is highly overall correlated with income_category [High correlation]
income_category is highly overall correlated with income [High correlation]
lat is highly overall correlated with county [High correlation]
lon is highly overall correlated with county and 1 other fields [High correlation]
maiden is highly overall correlated with healthcare_coverage and 1 other fields [High correlation]
marital is highly overall correlated with prefix [High correlation]
prefix is highly overall correlated with gender and 1 other fields [High correlation]
zip is highly overall correlated with county and 1 other fields [High correlation]
maiden is highly imbalanced (56.7%) [Imbalance]
race is highly imbalanced (59.2%) [Imbalance]
deathdate has 100 (94.3%) missing values [Missing]
id has unique values [Unique]
ssn has unique values [Unique]
address has unique values [Unique]
lat has unique values [Unique]
lon has unique values [Unique]
healthcare_expenses has unique values [Unique]
zip has 35 (33.0%) zeros [Zeros]
healthcare_coverage has 7 (6.6%) zeros [Zeros]
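
The imbalance percentages in the alerts are consistent with a score of one minus the normalized Shannon entropy of the category counts. The sketch below is an assumption about how such a score can be derived, not code taken from ydata-profiling. It uses the category counts reported for race (88, 8, 6, 3, 1) and maiden (78 "Unknown" plus 28 values that occur once) in the Variables section; the helper name `imbalance_score` is hypothetical.

```python
import math

def imbalance_score(counts):
    """Return 1 - H / log2(k): H is the Shannon entropy of the counts,
    k the number of classes. 0 means perfectly balanced, values near 1
    mean a single class dominates."""
    total = sum(counts)
    k = len(counts)
    if k < 2:
        return 0.0  # assumed convention: a single class has no imbalance score
    entropy = -sum((c / total) * math.log2(c / total) for c in counts if c > 0)
    return 1 - entropy / math.log2(k)

race = [88, 8, 6, 3, 1]        # white, asian, black, other, native
maiden = [78] + [1] * 28       # "Unknown" plus 28 one-off values

print(f"race:   {imbalance_score(race):.1%}")
print(f"maiden: {imbalance_score(maiden):.1%}")
```

Under this assumption both values round to the percentages shown in the alerts (59.2% and 56.7%).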

Reproduction

Analysis started: 2024-12-02 13:18:13.660335
Analysis finished: 2024-12-02 13:18:18.585945
Duration: 4.93 seconds
Software version: ydata-profiling v4.12.0
Download configuration: config.json

Variables

id
Text

Unique

Distinct: 106
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 36
Median length: 36
Mean length: 36
Min length: 36

Characters and Unicode

Total characters: 3816
Distinct characters: 17
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
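
As an illustration of the statement above, Python's standard `unicodedata` module exposes the general category of each code point. This is only a sketch of the idea, applied to the first sample value of id from this report; it is not necessarily how the profiler computes its category, script, and block counts.

```python
import unicodedata
from collections import Counter

sample_id = "30a6452c-4297-a1ac-977a-6a23237c7b46"  # 1st sample row of id

# Map every character to its Unicode general category, for example
# "Nd" (decimal digit), "Ll" (lowercase letter), "Pd" (dash punctuation).
category_counts = Counter(unicodedata.category(ch) for ch in sample_id)

for category, count in category_counts.most_common():
    print(category, count)
```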

Unique

Unique: 106
Unique (%): 100.0%

Sample

1st row: 30a6452c-4297-a1ac-977a-6a23237c7b46
2nd row: 34a4dcc4-35fb-6ad5-ab98-be285c586a4f
3rd row: 7179458e-d6e3-c723-2530-d4acfe1c2668
4th row: 37c177ea-4398-fb7a-29fa-70eb3d673876
5th row: 0fef2411-21f0-a269-82fb-c42b55471405

Value                                 Count  Frequency (%)
50ca7edb-0dee-35e6-5d8f-66fbcb0b37c1  1      0.9%
f339a5f7-0b09-3072-2b01-7c8e8ca2c1fc  1      0.9%
780fe740-20fb-07ee-1fbd-3fafa9f5df91  1      0.9%
cca2c7f0-a2aa-94e5-ccea-cb78a7d38652  1      0.9%
3c7e37b0-c610-bc9a-d75a-f782e5dc7598  1      0.9%
37713015-cfb5-bf1a-70eb-970101f32341  1      0.9%
d426334c-a982-3a31-7e0f-ca3c7fe01310  1      0.9%
cb1b46a1-9cb5-1187-ccc5-9fb7b98aa957  1      0.9%
d1622e8b-d26b-ec81-ffcb-ec4bf2af385b  1      0.9%
34a4dcc4-35fb-6ad5-ab98-be285c586a4f  1      0.9%
Other values (96)                     96     90.6%

Most occurring characters

Value             Count  Frequency (%)
-                 424    11.1%
e                 238    6.2%
6                 223    5.8%
0                 220    5.8%
7                 220    5.8%
c                 219    5.7%
1                 218    5.7%
4                 218    5.7%
3                 216    5.7%
2                 215    5.6%
Other values (7)  1405   36.8%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  3816   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  3816   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  3816   100.0%

Distinct: 100
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B
Minimum: 1914-03-03 00:00:00
Maximum: 2023-03-01 00:00:00
Histogram with fixed size bins (bins=50)
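
To make "fixed size bins" concrete, the sketch below splits the range between the Minimum and Maximum values reported above into 50 equal-width bins. The helper name `fixed_bin_edges` is hypothetical, and the profiler's own binning code may differ in detail.

```python
from datetime import datetime

def fixed_bin_edges(start, end, bins):
    """Return bins + 1 equally spaced edges covering [start, end]."""
    width = (end - start) / bins  # a timedelta divided by an int
    # Append `end` explicitly so the last edge is exact.
    return [start + i * width for i in range(bins)] + [end]

minimum = datetime(1914, 3, 3)
maximum = datetime(2023, 3, 1)

edges = fixed_bin_edges(minimum, maximum, bins=50)
print(len(edges), "edges, bin width:", edges[1] - edges[0])
```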

deathdate
Date

Missing

Distinct: 6
Distinct (%): 100.0%
Missing: 100
Missing (%): 94.3%
Memory size: 980.0 B
Minimum: 1972-06-01 00:00:00
Maximum: 2022-01-17 00:00:00
Histogram with fixed size bins (bins=6)

ssn
Text

Unique

Distinct: 106
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 11
Median length: 11
Mean length: 11
Min length: 11

Characters and Unicode

Total characters: 1166
Distinct characters: 11
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 106
Unique (%): 100.0%

Sample

1st row: 999-52-8591
2nd row: 999-75-3953
3rd row: 999-70-1925
4th row: 999-27-9779
5th row: 999-50-8977

Value              Count  Frequency (%)
999-27-5104        1      0.9%
999-66-2146        1      0.9%
999-71-1449        1      0.9%
999-36-7955        1      0.9%
999-74-5035        1      0.9%
999-80-8977        1      0.9%
999-80-9251        1      0.9%
999-83-1974        1      0.9%
999-55-3884        1      0.9%
999-75-3953        1      0.9%
Other values (96)  96     90.6%

Most occurring characters

Value  Count  Frequency (%)
9      387    33.2%
-      212    18.2%
3      77     6.6%
5      75     6.4%
7      70     6.0%
1      69     5.9%
8      65     5.6%
4      64     5.5%
2      60     5.1%
6      55     4.7%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1166   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1166   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1166   100.0%

Distinct: 85
Distinct (%): 80.2%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 9
Median length: 9
Mean length: 8.5849057
Min length: 7

Characters and Unicode

Total characters: 910
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 84
Unique (%): 79.2%
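
The Distinct and Unique counts differ because Distinct counts every different value once, while Unique counts only the values that occur exactly once. Here that fits 84 one-off values plus the placeholder "Unknown", which appears 22 times in the frequency table of this variable. A small sketch with made-up values (the list below is hypothetical, not taken from the dataset):

```python
from collections import Counter

# Hypothetical column: three one-off identifiers and a repeated placeholder.
values = ["S001", "S002", "Unknown", "S003", "Unknown"]

counts = Counter(values)
distinct = len(counts)                               # number of different values
unique = sum(1 for n in counts.values() if n == 1)   # values seen exactly once

print(distinct, unique)  # 4 3
```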

Sample

1st row: S99996852
2nd row: S99993577
3rd row: Unknown
4th row: S99995100
5th row: Unknown

Value              Count  Frequency (%)
unknown            22     20.8%
s99911538          1      0.9%
s99917166          1      0.9%
s99959101          1      0.9%
s99975537          1      0.9%
s99996852          1      0.9%
s99943171          1      0.9%
s99941458          1      0.9%
s99947055          1      0.9%
s99941595          1      0.9%
Other values (75)  75     70.8%

Most occurring characters

Value             Count  Frequency (%)
9                 291    32.0%
S                 84     9.2%
n                 66     7.3%
7                 56     6.2%
5                 49     5.4%
1                 48     5.3%
6                 47     5.2%
3                 46     5.1%
8                 39     4.3%
4                 36     4.0%
Other values (6)  148    16.3%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  910    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  910    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  910    100.0%

Distinct: 76
Distinct (%): 71.7%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 10
Median length: 10
Mean length: 9.0754717
Min length: 7

Characters and Unicode

Total characters: 962
Distinct characters: 16
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 75
Unique (%): 70.8%

Sample

1st row: X47758697X
2nd row: X28173268X
3rd row: Unknown
4th row: X83694889X
5th row: Unknown

Value              Count  Frequency (%)
unknown            31     29.2%
x83694889x         1      0.9%
x20976043x         1      0.9%
x31272602x         1      0.9%
x37637991x         1      0.9%
x59458953x         1      0.9%
x55687474x         1      0.9%
x14417836x         1      0.9%
x57524913x         1      0.9%
x27670495x         1      0.9%
Other values (66)  66     62.3%

Most occurring characters

Value             Count  Frequency (%)
X                 150    15.6%
n                 93     9.7%
7                 70     7.3%
4                 67     7.0%
6                 64     6.7%
3                 63     6.5%
9                 58     6.0%
8                 58     6.0%
1                 56     5.8%
5                 55     5.7%
Other values (6)  228    23.7%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  962    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  962    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  962    100.0%

prefix
Categorical

High correlation

Distinct: 4
Distinct (%): 3.8%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Value    Count
Mr.      31
Mrs.     28
Unknown  27
Ms.      20

Length

Max length: 7
Median length: 4
Mean length: 4.2830189
Min length: 3

Characters and Unicode

Total characters: 454
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Mr.
2nd row: Mr.
3rd row: Unknown
4th row: Mrs.
5th row: Unknown

Common Values

Value    Count  Frequency (%)
Mr.      31     29.2%
Mrs.     28     26.4%
Unknown  27     25.5%
Ms.      20     18.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
mr       31     29.2%
mrs      28     26.4%
unknown  27     25.5%
ms       20     18.9%

Most occurring characters

Value  Count  Frequency (%)
n      81     17.8%
M      79     17.4%
.      79     17.4%
r      59     13.0%
s      48     10.6%
U      27     5.9%
k      27     5.9%
o      27     5.9%
w      27     5.9%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  454    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  454    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  454    100.0%

Distinct: 104
Distinct (%): 98.1%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 14
Median length: 13
Mean length: 8.8584906
Min length: 5

Characters and Unicode

Total characters: 939
Distinct characters: 57
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 102
Unique (%): 96.2%

Sample

1st row: Joshua658
2nd row: Bennie663
3rd row: Hunter736
4th row: Carlyn477
5th row: Robin66

Value              Count  Frequency (%)
hershel911         2      1.9%
homero668          2      1.9%
antonia30          1      0.9%
bennie663          1      0.9%
hunter736          1      0.9%
carlyn477          1      0.9%
robin66            1      0.9%
arthur650          1      0.9%
caryl47            1      0.9%
willian804         1      0.9%
Other values (94)  94     88.7%

Most occurring characters

Value              Count  Frequency (%)
a                  82     8.7%
e                  65     6.9%
6                  51     5.4%
r                  50     5.3%
i                  50     5.3%
n                  48     5.1%
l                  36     3.8%
o                  35     3.7%
4                  34     3.6%
7                  32     3.4%
Other values (47)  456    48.6%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  939    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  939    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  939    100.0%

Distinct: 89
Distinct (%): 84.0%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 14
Median length: 13
Mean length: 8.6981132
Min length: 6

Characters and Unicode

Total characters: 922
Distinct characters: 55
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 87
Unique (%): 82.1%

Sample

1st row: Alvin56
2nd row: Unknown
3rd row: Mckinley734
4th row: Florencia449
5th row: Jeramy610

Value              Count  Frequency (%)
unknown            17     15.9%
danita413          2      1.9%
mckinley734        1      0.9%
florencia449       1      0.9%
jeramy610          1      0.9%
lelia627           1      0.9%
shelton25          1      0.9%
jordan900          1      0.9%
salley758          1      0.9%
bobbi508           1      0.9%
Other values (80)  80     74.8%

Most occurring characters

Value              Count  Frequency (%)
n                  99     10.7%
a                  76     8.2%
e                  50     5.4%
i                  47     5.1%
o                  45     4.9%
r                  42     4.6%
l                  32     3.5%
7                  32     3.5%
4                  30     3.3%
8                  30     3.3%
Other values (45)  439    47.6%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  922    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  922    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  922    100.0%

Distinct: 93
Distinct (%): 87.7%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 14
Median length: 12
Mean length: 9.8679245
Min length: 6

Characters and Unicode

Total characters: 1046
Distinct characters: 57
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 81
Unique (%): 76.4%

Sample

1st row: Kunde533
2nd row: Ebert178
3rd row: Gerlach374
4th row: Williamson769
5th row: Gleichner915

Value              Count  Frequency (%)
brakus656          3      2.8%
franecki195        2      1.9%
balistreri607      2      1.9%
carter549          2      1.9%
greenholt190       2      1.9%
schinner682        2      1.9%
schuppe920         2      1.9%
friesen796         2      1.9%
ebert178           2      1.9%
schiller186        2      1.9%
Other values (83)  85     80.2%

Most occurring characters

Value              Count  Frequency (%)
e                  87     8.3%
r                  67     6.4%
n                  52     5.0%
i                  52     5.0%
a                  51     4.9%
9                  48     4.6%
l                  44     4.2%
5                  37     3.5%
6                  34     3.3%
s                  34     3.3%
Other values (47)  540    51.6%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1046   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1046   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1046   100.0%

suffix
Categorical

Constant

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Value    Count
Unknown  106

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 742
Distinct characters: 5
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Unknown
2nd row: Unknown
3rd row: Unknown
4th row: Unknown
5th row: Unknown

Common Values

Value    Count  Frequency (%)
Unknown  106    100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
unknown  106    100.0%

Most occurring characters

Value  Count  Frequency (%)
n      318    42.9%
U      106    14.3%
k      106    14.3%
o      106    14.3%
w      106    14.3%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  742    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  742    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  742    100.0%

maiden
Categorical

High correlation, Imbalance

Distinct: 29
Distinct (%): 27.4%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Value              Count
Unknown            78
Rogahn59           1
Lubowitz58         1
Romaguera67        1
Langosh790         1
Other values (24)  24

Length

Max length: 14
Median length: 7
Mean length: 7.5660377
Min length: 7

Characters and Unicode

Total characters: 802
Distinct characters: 46
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 28
Unique (%): 26.4%

Sample

1st row: Unknown
2nd row: Unknown
3rd row: Unknown
4th row: Rogahn59
5th row: Unknown

Common Values

Value              Count  Frequency (%)
Unknown            78     73.6%
Rogahn59           1      0.9%
Lubowitz58         1      0.9%
Romaguera67        1      0.9%
Langosh790         1      0.9%
Wolf938            1      0.9%
Kshlerin58         1      0.9%
Durgan499          1      0.9%
Abreu185           1      0.9%
Hagenes547         1      0.9%
Other values (19)  19     17.9%

Length

Histogram of lengths of the category

Value              Count  Frequency (%)
unknown            78     73.6%
rogahn59           1      0.9%
lubowitz58         1      0.9%
romaguera67        1      0.9%
langosh790         1      0.9%
wolf938            1      0.9%
kshlerin58         1      0.9%
durgan499          1      0.9%
abreu185           1      0.9%
hagenes547         1      0.9%
Other values (19)  19     17.9%

Most occurring characters

Value              Count  Frequency (%)
n                  246    30.7%
o                  90     11.2%
k                  81     10.1%
w                  79     9.9%
U                  78     9.7%
a                  19     2.4%
r                  17     2.1%
e                  16     2.0%
1                  12     1.5%
9                  11     1.4%
Other values (36)  153    19.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  802    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  802    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  802    100.0%

marital
Categorical

High correlation

Distinct: 5
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Value    Count
Unknown  42
M        33
D        15
S        12
W        4

Length

Max length: 7
Median length: 1
Mean length: 3.3773585
Min length: 1

Characters and Unicode

Total characters: 358
Distinct characters: 9
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: D
3rd row: Unknown
4th row: M
5th row: Unknown

Common Values

Value    Count  Frequency (%)
Unknown  42     39.6%
M        33     31.1%
D        15     14.2%
S        12     11.3%
W        4      3.8%

Length

Histogram of lengths of the category

Common Values (Plot)

Value    Count  Frequency (%)
unknown  42     39.6%
m        33     31.1%
d        15     14.2%
s        12     11.3%
w        4      3.8%

Most occurring characters

Value  Count  Frequency (%)
n      126    35.2%
U      42     11.7%
k      42     11.7%
o      42     11.7%
w      42     11.7%
M      33     9.2%
D      15     4.2%
S      12     3.4%
W      4      1.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  358    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  358    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  358    100.0%

race
Categorical

Imbalance

Distinct: 5
Distinct (%): 4.7%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Value   Count
white   88
asian   8
black   6
other   3
native  1

Length

Max length: 6
Median length: 5
Mean length: 5.009434
Min length: 5

Characters and Unicode

Total characters: 531
Distinct characters: 15
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.9%

Sample

1st row: white
2nd row: white
3rd row: white
4th row: asian
5th row: white

Common Values

Value   Count  Frequency (%)
white   88     83.0%
asian   8      7.5%
black   6      5.7%
other   3      2.8%
native  1      0.9%

Length

Histogram of lengths of the category

Common Values (Plot)

Value   Count  Frequency (%)
white   88     83.0%
asian   8      7.5%
black   6      5.7%
other   3      2.8%
native  1      0.9%

Most occurring characters

Value             Count  Frequency (%)
i                 97     18.3%
e                 92     17.3%
t                 92     17.3%
h                 91     17.1%
w                 88     16.6%
a                 23     4.3%
n                 9      1.7%
s                 8      1.5%
b                 6      1.1%
l                 6      1.1%
Other values (5)  19     3.6%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  531    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  531    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  531    100.0%

ethnicity
Categorical

Distinct: 2
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Value        Count
nonhispanic  90
hispanic     16

Length

Max length: 11
Median length: 11
Mean length: 10.54717
Min length: 8

Characters and Unicode

Total characters: 1118
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: nonhispanic
2nd row: nonhispanic
3rd row: nonhispanic
4th row: nonhispanic
5th row: nonhispanic

Common Values

Value        Count  Frequency (%)
nonhispanic  90     84.9%
hispanic     16     15.1%

Length

Histogram of lengths of the category

Common Values (Plot)

Value        Count  Frequency (%)
nonhispanic  90     84.9%
hispanic     16     15.1%

Most occurring characters

Value  Count  Frequency (%)
n      286    25.6%
i      212    19.0%
h      106    9.5%
s      106    9.5%
a      106    9.5%
p      106    9.5%
c      106    9.5%
o      90     8.1%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  1118   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  1118   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  1118   100.0%

gender
Categorical

High correlation

Distinct: 2
Distinct (%): 1.9%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Value  Count
F      59
M      47

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 106
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: M
2nd row: M
3rd row: M
4th row: F
5th row: M

Common Values

Value  Count  Frequency (%)
F      59     55.7%
M      47     44.3%

Length

Histogram of lengths of the category

Common Values (Plot)

Value  Count  Frequency (%)
f      59     55.7%
m      47     44.3%

Most occurring characters

Value  Count  Frequency (%)
F      59     55.7%
M      47     44.3%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  106    100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  106    100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  106    100.0%

Distinct: 83
Distinct (%): 78.3%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 40
Median length: 36
Mean length: 27.707547
Min length: 18

Characters and Unicode

Total characters: 2937
Distinct characters: 49
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 70
Unique (%): 66.0%

Sample

1st row: Boston Massachusetts US
2nd row: Chicopee Massachusetts US
3rd row: Spencer Massachusetts US
4th row: Franklin Massachusetts US
5th row: Brockton Massachusetts US

Value               Count  Frequency (%)
us                  91     26.0%
massachusetts       91     26.0%
boston              11     3.1%
north               6      1.7%
santiago            6      1.7%
do                  4      1.1%
de                  3      0.9%
puerto              3      0.9%
los                 3      0.9%
pr                  3      0.9%
Other values (106)  129    36.9%

Most occurring characters

Value              Count  Frequency (%)
(space)            456    15.5%
s                  408    13.9%
a                  273    9.3%
t                  259    8.8%
e                  182    6.2%
h                  127    4.3%
u                  118    4.0%
c                  116    3.9%
S                  112    3.8%
o                  111    3.8%
Other values (39)  775    26.4%

Most occurring categories

Value      Count  Frequency (%)
(unknown)  2937   100.0%

Most occurring scripts

Value      Count  Frequency (%)
(unknown)  2937   100.0%

Most occurring blocks

Value      Count  Frequency (%)
(unknown)  2937   100.0%

address
Text

Unique 

Distinct: 106
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 33
Median length: 27
Mean length: 20.367925
Min length: 13

Characters and Unicode

Total characters: 2159
Distinct characters: 59
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 106
Unique (%): 100.0%

Sample

1st row: 811 Kihn Viaduct
2nd row: 975 Pfannerstill Throughway
3rd row: 548 Heller Lane
4th row: 160 Fadel Crossroad Apt 65
5th row: 766 Grant Loaf Unit 15

Value | Count | Frequency (%)
unit | 16 | 4.0%
apt | 12 | 3.0%
suite | 12 | 3.0%
road | 4 | 1.0%
row | 4 | 1.0%
brook | 3 | 0.8%
esplanade | 3 | 0.8%
rapid | 3 | 0.8%
stravenue | 3 | 0.8%
street | 3 | 0.8%
Other values (289) | 337 | 84.2%

Most occurring characters

Value | Count | Frequency (%)
(space) | 294 | 13.6%
e | 153 | 7.1%
a | 129 | 6.0%
r | 114 | 5.3%
i | 103 | 4.8%
n | 98 | 4.5%
t | 97 | 4.5%
o | 86 | 4.0%
s | 59 | 2.7%
l | 59 | 2.7%
Other values (49) | 967 | 44.8%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 2159 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 2159 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 2159 | 100.0%

city
Text

Distinct: 78
Distinct (%): 73.6%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 22
Median length: 16
Mean length: 8.7830189
Min length: 4

Characters and Unicode

Total characters: 931
Distinct characters: 42
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 61
Unique (%): 57.5%

Sample

1st row: Braintree
2nd row: Braintree
3rd row: Mattapoisett
4th row: Wareham
5th row: Groveland

Value | Count | Frequency (%)
boston | 7 | 5.8%
west | 7 | 5.8%
braintree | 4 | 3.3%
yarmouth | 3 | 2.5%
tisbury | 3 | 2.5%
brookfield | 3 | 2.5%
somerville | 3 | 2.5%
north | 3 | 2.5%
lowell | 3 | 2.5%
springfield | 3 | 2.5%
Other values (70) | 82 | 67.8%

Most occurring characters

Value | Count | Frequency (%)
e | 100 | 10.7%
o | 83 | 8.9%
r | 75 | 8.1%
t | 74 | 7.9%
n | 63 | 6.8%
i | 51 | 5.5%
a | 50 | 5.4%
l | 50 | 5.4%
s | 37 | 4.0%
d | 34 | 3.7%
Other values (32) | 314 | 33.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 931 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 931 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 931 | 100.0%

state
Categorical

Constant 

Distinct: 1
Distinct (%): 0.9%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 13
Median length: 13
Mean length: 13
Min length: 13

Characters and Unicode

Total characters: 1378
Distinct characters: 8
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Massachusetts
2nd row: Massachusetts
3rd row: Massachusetts
4th row: Massachusetts
5th row: Massachusetts

Common Values

Value | Count | Frequency (%)
Massachusetts | 106 | 100.0%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
massachusetts | 106 | 100.0%

Most occurring characters

Value | Count | Frequency (%)
s | 424 | 30.8%
a | 212 | 15.4%
t | 212 | 15.4%
M | 106 | 7.7%
c | 106 | 7.7%
h | 106 | 7.7%
u | 106 | 7.7%
e | 106 | 7.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1378 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1378 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1378 | 100.0%

county
Categorical

High correlation 

Distinct: 13
Distinct (%): 12.3%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 17
Median length: 16
Mean length: 14.830189
Min length: 12

Characters and Unicode

Total characters: 1572
Distinct characters: 32
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): 0.9%

Sample

1st row: Norfolk County
2nd row: Norfolk County
3rd row: Plymouth County
4th row: Plymouth County
5th row: Essex County

Common Values

Value | Count | Frequency (%)
Middlesex County | 29 | 27.4%
Worcester County | 13 | 12.3%
Essex County | 10 | 9.4%
Norfolk County | 9 | 8.5%
Plymouth County | 8 | 7.5%
Suffolk County | 8 | 7.5%
Hampden County | 8 | 7.5%
Bristol County | 7 | 6.6%
Barnstable County | 5 | 4.7%
Dukes County | 4 | 3.8%
Other values (3) | 5 | 4.7%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
county | 106 | 50.0%
middlesex | 29 | 13.7%
worcester | 13 | 6.1%
essex | 10 | 4.7%
norfolk | 9 | 4.2%
plymouth | 8 | 3.8%
suffolk | 8 | 3.8%
hampden | 8 | 3.8%
bristol | 7 | 3.3%
barnstable | 5 | 2.4%
Other values (4) | 9 | 4.2%

Most occurring characters

Value | Count | Frequency (%)
o | 160 | 10.2%
t | 139 | 8.8%
u | 126 | 8.0%
n | 121 | 7.7%
e | 117 | 7.4%
y | 114 | 7.3%
C | 106 | 6.7%
(space) | 106 | 6.7%
s | 82 | 5.2%
l | 67 | 4.3%
Other values (22) | 434 | 27.6%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1572 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1572 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1572 | 100.0%

fips
Categorical

High correlation 

Distinct: 14
Distinct (%): 13.2%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 7
Median length: 7
Mean length: 7
Min length: 7

Characters and Unicode

Total characters: 742
Distinct characters: 14
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 4
Unique (%): 3.8%

Sample

1st row: 25021.0
2nd row: 25021.0
3rd row: Unknown
4th row: Unknown
5th row: Unknown

Common Values

Value | Count | Frequency (%)
Unknown | 35 | 33.0%
25017.0 | 24 | 22.6%
25021.0 | 8 | 7.5%
25027.0 | 8 | 7.5%
25013.0 | 7 | 6.6%
25025.0 | 7 | 6.6%
25009.0 | 5 | 4.7%
25001.0 | 4 | 3.8%
25003.0 | 2 | 1.9%
25005.0 | 2 | 1.9%
Other values (4) | 4 | 3.8%

Length

Histogram of lengths of the category

Value | Count | Frequency (%)
unknown | 35 | 33.0%
25017.0 | 24 | 22.6%
25021.0 | 8 | 7.5%
25027.0 | 8 | 7.5%
25013.0 | 7 | 6.6%
25025.0 | 7 | 6.6%
25009.0 | 5 | 4.7%
25001.0 | 4 | 3.8%
25003.0 | 2 | 1.9%
25005.0 | 2 | 1.9%
Other values (4) | 4 | 3.8%

Most occurring characters

Value | Count | Frequency (%)
0 | 156 | 21.0%
n | 105 | 14.2%
2 | 94 | 12.7%
5 | 80 | 10.8%
. | 71 | 9.6%
1 | 46 | 6.2%
w | 35 | 4.7%
o | 35 | 4.7%
k | 35 | 4.7%
U | 35 | 4.7%
Other values (4) | 50 | 6.7%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 742 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 742 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 742 | 100.0%

zip
Real number (ℝ)

High correlation  Zeros 

Distinct: 61
Distinct (%): 57.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1317.8868
Minimum: 0
Maximum: 2861
Zeros: 35
Zeros (%): 33.0%
Negative: 0
Negative (%): 0.0%
Memory size: 980.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 1605
Q3: 2154.25
95-th percentile: 2636.25
Maximum: 2861
Range: 2861
Interquartile range (IQR): 2154.25

Descriptive statistics

Standard deviation: 1012.0357
Coefficient of variation (CV): 0.7679231
Kurtosis: -1.539062
Mean: 1317.8868
Median Absolute Deviation (MAD): 602
Skewness: -0.29374171
Sum: 139696
Variance: 1024216.3
Monotonicity: Not monotonic
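
Several of these figures are tied together by simple identities, which makes them easy to sanity-check. The sketch below recomputes the coefficient of variation, variance, interquartile range, and range from the zip figures reported in this section; it only restates the standard definitions and is not the profiling tool's own code.

```python
# Figures copied from the zip statistics in this report.
mean = 1317.8868
std = 1012.0357
q1, q3 = 0.0, 2154.25
minimum, maximum = 0.0, 2861.0

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
iqr = q3 - q1                    # interquartile range
value_range = maximum - minimum  # range

print(round(cv, 4), round(variance, 1), iqr, value_range)
```

The recomputed values agree with the reported CV, Variance, IQR, and Range to the precision shown in the report.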
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 35 | 33.0%
2184 | 4 | 3.8%
2138 | 2 | 1.9%
2492 | 2 | 1.9%
1915 | 2 | 1.9%
1030 | 2 | 1.9%
2155 | 2 | 1.9%
2180 | 2 | 1.9%
2468 | 2 | 1.9%
1585 | 2 | 1.9%
Other values (51) | 51 | 48.1%

Value | Count | Frequency (%)
0 | 35 | 33.0%
1020 | 1 | 0.9%
1030 | 2 | 1.9%
1038 | 1 | 0.9%
1040 | 1 | 0.9%
1108 | 1 | 0.9%
1129 | 1 | 0.9%
1199 | 1 | 0.9%
1201 | 1 | 0.9%
1220 | 1 | 0.9%

Value | Count | Frequency (%)
2861 | 1 | 0.9%
2743 | 1 | 0.9%
2718 | 1 | 0.9%
2675 | 1 | 0.9%
2673 | 1 | 0.9%
2638 | 1 | 0.9%
2631 | 1 | 0.9%
2492 | 2 | 1.9%
2476 | 1 | 0.9%
2472 | 1 | 0.9%
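
The zip values are profiled as numbers, which would drop the leading zero of a Massachusetts ZIP code (2184 rather than 02184). The 35 zeros also match the count of `Unknown` fips values, which suggests that 0 may stand in for a missing code. Both readings are assumptions, not something the report states. Under those assumptions, a hedged sketch for restoring a five-character string form; the helper name `format_zip` is hypothetical:

```python
def format_zip(value):
    """Render a numeric zip as a 5-character string, or None when missing.

    Assumes 0 marks a missing code and that leading zeros were lost
    when the column was parsed as a number.
    """
    code = int(value)
    if code == 0:
        return None
    return str(code).zfill(5)

print(format_zip(2184))
print(format_zip(0))
```

Treating the column as text in this way would also keep it out of numeric summaries such as mean and skewness, which have no natural meaning for postal codes.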

lat
Real number (ℝ)

High correlation  Unique 

Distinct: 106
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 42.242437
Minimum: 41.296177
Maximum: 42.816043
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 980.0 B

Quantile statistics

Minimum: 41.296177
5-th percentile: 41.650588
Q1: 42.103036
median: 42.289558
Q3: 42.449431
95-th percentile: 42.685496
Maximum: 42.816043
Range: 1.519866
Interquartile range (IQR): 0.34639512

Descriptive statistics

Standard deviation: 0.3301513
Coefficient of variation (CV): 0.0078156311
Kurtosis: 0.36660563
Mean: 42.242437
Median Absolute Deviation (MAD): 0.18650434
Skewness: -0.82384894
Sum: 4477.6983
Variance: 0.10899988
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
42.42176351 | 1 | 0.9%
42.21114203 | 1 | 0.9%
41.43371899 | 1 | 0.9%
42.55418125 | 1 | 0.9%
42.0658866 | 1 | 0.9%
42.11271886 | 1 | 0.9%
42.48040619 | 1 | 0.9%
42.34024487 | 1 | 0.9%
42.04613216 | 1 | 0.9%
42.23763486 | 1 | 0.9%
Other values (96) | 96 | 90.6%

Value | Count | Frequency (%)
41.29617684 | 1 | 0.9%
41.35420569 | 1 | 0.9%
41.43371899 | 1 | 0.9%
41.49098336 | 1 | 0.9%
41.53862265 | 1 | 0.9%
41.64829219 | 1 | 0.9%
41.65747489 | 1 | 0.9%
41.66687013 | 1 | 0.9%
41.66850729 | 1 | 0.9%
41.68058212 | 1 | 0.9%

Value | Count | Frequency (%)
42.81604282 | 1 | 0.9%
42.74496939 | 1 | 0.9%
42.73418302 | 1 | 0.9%
42.71912524 | 1 | 0.9%
42.69560171 | 1 | 0.9%
42.68816668 | 1 | 0.9%
42.67748594 | 1 | 0.9%
42.67584335 | 1 | 0.9%
42.65765376 | 1 | 0.9%
42.64079354 | 1 | 0.9%

lon
Real number (ℝ)

High correlation  Unique 

Distinct: 106
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: -71.328445
Minimum: -73.237785
Maximum: -70.107363
Zeros: 0
Zeros (%): 0.0%
Negative: 106
Negative (%): 100.0%
Memory size: 980.0 B

Quantile statistics

Minimum: -73.237785
5-th percentile: -72.616709
Q1: -71.522152
median: -71.148032
Q3: -71.002221
95-th percentile: -70.618146
Maximum: -70.107363
Range: 3.1304225
Interquartile range (IQR): 0.51993114

Descriptive statistics

Standard deviation: 0.62739907
Coefficient of variation (CV): -0.0087959169
Kurtosis: 0.94806933
Mean: -71.328445
Median Absolute Deviation (MAD): 0.17323482
Skewness: -1.0377175
Sum: -7560.8151
Variance: 0.39362959
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
-71.00480709 | 1 | 0.9%
-71.04580218 | 1 | 0.9%
-70.66515525 | 1 | 0.9%
-71.14725027 | 1 | 0.9%
-72.25779487 | 1 | 0.9%
-70.98132689 | 1 | 0.9%
-70.8970629 | 1 | 0.9%
-71.08674318 | 1 | 0.9%
-71.00135889 | 1 | 0.9%
-71.018562 | 1 | 0.9%
Other values (96) | 96 | 90.6%

Value | Count | Frequency (%)
-73.23778524 | 1 | 0.9%
-73.11924941 | 1 | 0.9%
-72.65371344 | 1 | 0.9%
-72.63252116 | 1 | 0.9%
-72.62755691 | 1 | 0.9%
-72.61762464 | 1 | 0.9%
-72.61396084 | 1 | 0.9%
-72.5987263 | 1 | 0.9%
-72.56676186 | 1 | 0.9%
-72.54947692 | 1 | 0.9%

Value | Count | Frequency (%)
-70.10736269 | 1 | 0.9%
-70.17711054 | 1 | 0.9%
-70.21884617 | 1 | 0.9%
-70.23283428 | 1 | 0.9%
-70.28956236 | 1 | 0.9%
-70.61376021 | 1 | 0.9%
-70.63130157 | 1 | 0.9%
-70.65792862 | 1 | 0.9%
-70.66515525 | 1 | 0.9%
-70.71161606 | 1 | 0.9%

healthcare_expenses
Real number (ℝ)

High correlation  Unique 

Distinct: 106
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 202248.22
Minimum: 527.54
Maximum: 1157946.9
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 980.0 B

Quantile statistics

Minimum: 527.54
5-th percentile: 4963.3575
Q1: 29803.083
median: 93132.02
Q3: 243127.97
95-th percentile: 838508.74
Maximum: 1157946.9
Range: 1157419.4
Interquartile range (IQR): 213324.89

Descriptive statistics

Standard deviation: 263109.59
Coefficient of variation (CV): 1.3009242
Kurtosis: 3.2083286
Mean: 202248.22
Median Absolute Deviation (MAD): 80789.405
Skewness: 1.9370994
Sum: 21438311
Variance: 6.9226658 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
655129.7 | 1 | 0.9%
56904.96 | 1 | 0.9%
1157946.95 | 1 | 0.9%
65711 | 1 | 0.9%
442640.29 | 1 | 0.9%
226219.18 | 1 | 0.9%
306535.33 | 1 | 0.9%
143573.39 | 1 | 0.9%
3895.86 | 1 | 0.9%
56729.23 | 1 | 0.9%
Other values (96) | 96 | 90.6%

Value | Count | Frequency (%)
527.54 | 1 | 0.9%
2460.22 | 1 | 0.9%
3895.86 | 1 | 0.9%
3969.77 | 1 | 0.9%
4416.06 | 1 | 0.9%
4871.79 | 1 | 0.9%
5238.06 | 1 | 0.9%
7380.74 | 1 | 0.9%
7617.33 | 1 | 0.9%
8531.67 | 1 | 0.9%

Value | Count | Frequency (%)
1157946.95 | 1 | 0.9%
1068387.92 | 1 | 0.9%
976441.28 | 1 | 0.9%
955755.57 | 1 | 0.9%
934784.64 | 1 | 0.9%
849950.23 | 1 | 0.9%
804184.27 | 1 | 0.9%
754029.94 | 1 | 0.9%
677634.54 | 1 | 0.9%
655129.7 | 1 | 0.9%

healthcare_coverage
Real number (ℝ)

High correlation  Zeros 

Distinct: 100
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 298700.27
Minimum: 0
Maximum: 1441488.7
Zeros: 7
Zeros (%): 6.6%
Negative: 0
Negative (%): 0.0%
Memory size: 980.0 B

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 13139.33
median: 187893.31
Q3: 439327.89
95-th percentile: 987837.51
Maximum: 1441488.7
Range: 1441488.7
Interquartile range (IQR): 426188.56

Descriptive statistics

Standard deviation: 342822.6
Coefficient of variation (CV): 1.1477144
Kurtosis: 1.075759
Mean: 298700.27
Median Absolute Deviation (MAD): 180653.3
Skewness: 1.3171977
Sum: 31662228
Variance: 1.1752733 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
0 | 7 | 6.6%
185658.8 | 1 | 0.9%
9394.81 | 1 | 0.9%
12883.79 | 1 | 0.9%
3361.88 | 1 | 0.9%
119973.56 | 1 | 0.9%
26025.55 | 1 | 0.9%
230741.1 | 1 | 0.9%
190525.37 | 1 | 0.9%
77976.38 | 1 | 0.9%
Other values (90) | 90 | 84.9%

Value | Count | Frequency (%)
0 | 7 | 6.6%
539.02 | 1 | 0.9%
640.19 | 1 | 0.9%
693.39 | 1 | 0.9%
1075.06 | 1 | 0.9%
1941.56 | 1 | 0.9%
3361.88 | 1 | 0.9%
3414.68 | 1 | 0.9%
4193.28 | 1 | 0.9%
4560.98 | 1 | 0.9%

Value | Count | Frequency (%)
1441488.68 | 1 | 0.9%
1280069.64 | 1 | 0.9%
1267791.2 | 1 | 0.9%
1106488.24 | 1 | 0.9%
1015294.53 | 1 | 0.9%
1009957.09 | 1 | 0.9%
921478.76 | 1 | 0.9%
900011.77 | 1 | 0.9%
884243.46 | 1 | 0.9%
878644.32 | 1 | 0.9%

income
Real number (ℝ)

High correlation 

Distinct: 100
Distinct (%): 94.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 106080.06
Minimum: 7361
Maximum: 816851
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 980.0 B

Quantile statistics

Minimum: 7361
5-th percentile: 10271.75
Q1: 35910.25
median: 76761.5
Q3: 117661
95-th percentile: 198502
Maximum: 816851
Range: 809490
Interquartile range (IQR): 81750.75

Descriptive statistics

Standard deviation: 139939.05
Coefficient of variation (CV): 1.3191834
Kurtosis: 14.847463
Mean: 106080.06
Median Absolute Deviation (MAD): 41088.5
Skewness: 3.7350062
Sum: 11244486
Variance: 1.9582938 × 10^10
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Value | Count | Frequency (%)
90297 | 2 | 1.9%
95344 | 2 | 1.9%
7361 | 2 | 1.9%
58212 | 2 | 1.9%
92537 | 2 | 1.9%
49737 | 2 | 1.9%
163299 | 1 | 0.9%
64743 | 1 | 0.9%
550030 | 1 | 0.9%
83325 | 1 | 0.9%
Other values (90) | 90 | 84.9%

Value | Count | Frequency (%)
7361 | 2 | 1.9%
7873 | 1 | 0.9%
8615 | 1 | 0.9%
8752 | 1 | 0.9%
10135 | 1 | 0.9%
10682 | 1 | 0.9%
12128 | 1 | 0.9%
16969 | 1 | 0.9%
17382 | 1 | 0.9%
18258 | 1 | 0.9%

Value | Count | Frequency (%)
816851 | 1 | 0.9%
762068 | 1 | 0.9%
742063 | 1 | 0.9%
550030 | 1 | 0.9%
545255 | 1 | 0.9%
198522 | 1 | 0.9%
198442 | 1 | 0.9%
189277 | 1 | 0.9%
188023 | 1 | 0.9%
179090 | 1 | 0.9%

income_category
Categorical

High correlation 

Distinct: 3
Distinct (%): 2.8%
Missing: 0
Missing (%): 0.0%
Memory size: 980.0 B

Length

Max length: 13
Median length: 11
Mean length: 11.216981
Min length: 10

Characters and Unicode

Total characters: 1189
Distinct characters: 13
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: high-income
2nd row: medium-income
3rd row: high-income
4th row: low-income
5th row: medium-income

Common Values

Value | Count | Frequency (%)
high-income | 45 | 42.5%
low-income | 33 | 31.1%
medium-income | 28 | 26.4%

Length

Histogram of lengths of the category

Common Values (Plot)

Value | Count | Frequency (%)
high-income | 45 | 42.5%
low-income | 33 | 31.1%
medium-income | 28 | 26.4%

Most occurring characters

Value | Count | Frequency (%)
i | 179 | 15.1%
m | 162 | 13.6%
o | 139 | 11.7%
e | 134 | 11.3%
n | 106 | 8.9%
- | 106 | 8.9%
c | 106 | 8.9%
h | 90 | 7.6%
g | 45 | 3.8%
l | 33 | 2.8%
Other values (3) | 89 | 7.5%

Most occurring categories

Value | Count | Frequency (%)
(unknown) | 1189 | 100.0%

Most occurring scripts

Value | Count | Frequency (%)
(unknown) | 1189 | 100.0%

Most occurring blocks

Value | Count | Frequency (%)
(unknown) | 1189 | 100.0%

Interactions


Correlations

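variable | county | ethnicity | fips | gender | healthcare_coverage | healthcare_expenses | income | income_category | lat | lon | maiden | marital | prefix | race | zip
county | 1.000 | 0.287 | 0.778 | 0.000 | 0.184 | 0.256 | 0.000 | 0.196 | 0.514 | 0.668 | 0.000 | 0.093 | 0.000 | 0.234 | 0.609
ethnicity | 0.287 | 1.000 | 0.336 | 0.000 | 0.037 | 0.282 | 0.000 | 0.000 | 0.000 | 0.314 | 0.195 | 0.000 | 0.227 | 0.000 | 0.281
fips | 0.778 | 0.336 | 1.000 | 0.237 | 0.000 | 0.000 | 0.108 | 0.248 | 0.323 | 0.584 | 0.000 | 0.013 | 0.000 | 0.000 | 0.798
gender | 0.000 | 0.000 | 0.237 | 1.000 | 0.261 | 0.000 | 0.089 | 0.000 | 0.279 | 0.136 | 0.140 | 0.000 | 0.854 | 0.000 | 0.000
healthcare_coverage | 0.184 | 0.037 | 0.000 | 0.261 | 1.000 | 0.490 | -0.070 | 0.196 | -0.195 | 0.199 | 0.658 | 0.387 | 0.400 | 0.158 | -0.063
healthcare_expenses | 0.256 | 0.282 | 0.000 | 0.000 | 0.490 | 1.000 | 0.242 | 0.000 | -0.140 | 0.121 | 0.623 | 0.390 | 0.280 | 0.331 | -0.032
income | 0.000 | 0.000 | 0.108 | 0.089 | -0.070 | 0.242 | 1.000 | 0.686 | -0.032 | 0.027 | 0.000 | 0.054 | 0.000 | 0.000 | -0.149
income_category | 0.196 | 0.000 | 0.248 | 0.000 | 0.196 | 0.000 | 0.686 | 1.000 | 0.137 | 0.089 | 0.000 | 0.146 | 0.000 | 0.000 | 0.179
lat | 0.514 | 0.000 | 0.323 | 0.279 | -0.195 | -0.140 | -0.032 | 0.137 | 1.000 | -0.260 | 0.409 | 0.098 | 0.118 | 0.324 | 0.073
lon | 0.668 | 0.314 | 0.584 | 0.136 | 0.199 | 0.121 | 0.027 | 0.089 | -0.260 | 1.000 | 0.000 | 0.000 | 0.012 | 0.224 | 0.130
maiden | 0.000 | 0.195 | 0.000 | 0.140 | 0.658 | 0.623 | 0.000 | 0.000 | 0.409 | 0.000 | 1.000 | 0.372 | 0.262 | 0.189 | 0.000
marital | 0.093 | 0.000 | 0.013 | 0.000 | 0.387 | 0.390 | 0.054 | 0.146 | 0.098 | 0.000 | 0.372 | 1.000 | 0.565 | 0.000 | 0.000
prefix | 0.000 | 0.227 | 0.000 | 0.854 | 0.400 | 0.280 | 0.000 | 0.000 | 0.118 | 0.012 | 0.262 | 0.565 | 1.000 | 0.000 | 0.000
race | 0.234 | 0.000 | 0.000 | 0.000 | 0.158 | 0.331 | 0.000 | 0.000 | 0.324 | 0.224 | 0.189 | 0.000 | 0.000 | 1.000 | 0.151
zip | 0.609 | 0.281 | 0.798 | 0.000 | -0.063 | -0.032 | -0.149 | 0.179 | 0.073 | 0.130 | 0.000 | 0.000 | 0.000 | 0.151 | 1.000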
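
The High correlation alerts in the report overview line up with the entries of this matrix whose absolute value is at least 0.5. The report does not state the threshold, so 0.5 is an assumption inferred from the alerts. A small sketch that lists such pairs from a hand-copied subset of the matrix (the dictionary is not the full matrix, and `high_pairs` is a hypothetical helper):

```python
# Hand-copied subset of the correlation matrix in this section.
corr = {
    ("county", "fips"): 0.778,
    ("county", "lat"): 0.514,
    ("county", "lon"): 0.668,
    ("county", "zip"): 0.609,
    ("county", "race"): 0.234,
    ("gender", "prefix"): 0.854,
    ("income", "income_category"): 0.686,
    ("lat", "lon"): -0.260,
}

THRESHOLD = 0.5  # assumed cut-off, not stated in the report

def high_pairs(matrix, threshold=THRESHOLD):
    """Return pairs whose absolute coefficient meets the threshold, strongest first."""
    hits = [(pair, value) for pair, value in matrix.items() if abs(value) >= threshold]
    return sorted(hits, key=lambda item: abs(item[1]), reverse=True)

for (a, b), value in high_pairs(corr):
    print(f"{a} ~ {b}: {value:.3f}")
```

With this subset, county is paired with fips, lat, lon, and zip, which matches the overview alert that county is highly correlated with fips and 3 other fields.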

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly spot patterns in data completion.
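
Both views summarise where values are absent. A minimal sketch of the underlying per-column count, assuming missing values are represented as `None` and using three made-up records rather than the report's data; the helper name `missing_by_column` is hypothetical:

```python
def missing_by_column(rows):
    """Count None values per column across a list of dict records."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            counts[column] = counts.get(column, 0) + (value is None)
    return counts

# Hypothetical records; only the column names come from this report.
rows = [
    {"id": "a", "deathdate": None, "zip": 2184},
    {"id": "b", "deathdate": "2009-12-11", "zip": 0},
    {"id": "c", "deathdate": None, "zip": 0},
]
counts = missing_by_column(rows)
total_cells = len(rows) * len(rows[0])
print(counts, f"{sum(counts.values()) / total_cells:.1%}")
```

Zeros are not counted as missing here, which mirrors how the report lists the zip zeros under Zeros rather than Missing.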

Sample

index | id | birthdate | deathdate | ssn | drivers | passport | prefix | firstname | middlename | lastname | suffix | maiden | marital | race | ethnicity | gender | birthplace | address | city | state | county | fips | zip | lat | lon | healthcare_expenses | healthcare_coverage | income | income_category
0 | 30a6452c-4297-a1ac-977a-6a23237c7b46 | 1994-02-06 | NaN | 999-52-8591 | S99996852 | X47758697X | Mr. | Joshua658 | Alvin56 | Kunde533 | Unknown | Unknown | M | white | nonhispanic | M | Boston Massachusetts US | 811 Kihn Viaduct | Braintree | Massachusetts | Norfolk County | 25021.0 | 2184 | 42.211142 | -71.045802 | 56904.96 | 18019.99 | 100511 | high-income
1 | 34a4dcc4-35fb-6ad5-ab98-be285c586a4f | 1968-08-06 | 2009-12-11 | 999-75-3953 | S99993577 | X28173268X | Mr. | Bennie663 | Unknown | Ebert178 | Unknown | Unknown | D | white | nonhispanic | M | Chicopee Massachusetts US | 975 Pfannerstill Throughway | Braintree | Massachusetts | Norfolk County | 25021.0 | 2184 | 42.255420 | -70.971016 | 124024.12 | 1075.06 | 49737 | medium-income
2 | 7179458e-d6e3-c723-2530-d4acfe1c2668 | 2008-12-21 | NaN | 999-70-1925 | Unknown | Unknown | Unknown | Hunter736 | Mckinley734 | Gerlach374 | Unknown | Unknown | Unknown | white | nonhispanic | M | Spencer Massachusetts US | 548 Heller Lane | Mattapoisett | Massachusetts | Plymouth County | Unknown | 0 | 41.648292 | -70.850619 | 45645.06 | 6154.94 | 133816 | high-income
3 | 37c177ea-4398-fb7a-29fa-70eb3d673876 | 1994-01-27 | NaN | 999-27-9779 | S99995100 | X83694889X | Mrs. | Carlyn477 | Florencia449 | Williamson769 | Unknown | Rogahn59 | M | asian | nonhispanic | F | Franklin Massachusetts US | 160 Fadel Crossroad Apt 65 | Wareham | Massachusetts | Plymouth County | Unknown | 0 | 41.789096 | -70.711616 | 12895.15 | 659951.61 | 17382 | low-income
4 | 0fef2411-21f0-a269-82fb-c42b55471405 | 2019-07-27 | NaN | 999-50-8977 | Unknown | Unknown | Unknown | Robin66 | Jeramy610 | Gleichner915 | Unknown | Unknown | Unknown | white | nonhispanic | M | Brockton Massachusetts US | 766 Grant Loaf Unit 15 | Groveland | Massachusetts | Essex County | Unknown | 0 | 42.734183 | -70.976410 | 18500.02 | 5493.57 | 52159 | medium-income
5 | ec1a6cad-8825-7b5c-4e14-257c696d5f11 | 2019-04-18 | NaN | 999-13-4533 | Unknown | Unknown | Unknown | Arthur650 | Unknown | Roberts511 | Unknown | Unknown | Unknown | white | nonhispanic | M | Plymouth Massachusetts US | 866 Kulas Harbor | Cambridge | Massachusetts | Middlesex County | 25017.0 | 2138 | 42.377781 | -71.044112 | 14478.23 | 693.39 | 75767 | medium-income
6 | 4569671e-ed39-055f-8e78-422b96c9896b | 2013-08-10 | NaN | 999-40-7708 | Unknown | Unknown | Unknown | Caryl47 | Lelia627 | Kassulke119 | Unknown | Unknown | Unknown | white | nonhispanic | F | East Falmouth Massachusetts US | 578 Dickens Camp | Arlington | Massachusetts | Middlesex County | 25017.0 | 2476 | 42.412276 | -71.202859 | 9821.14 | 27142.51 | 58294 | medium-income
7 | c1acd7ba-dacf-36d2-6010-db8934400000 | 1968-08-06 | NaN | 999-97-4087 | S99911538 | X37637991X | Mr. | Willian804 | Shelton25 | Keeling57 | Unknown | Unknown | M | white | nonhispanic | M | Methuen Massachusetts US | 848 Ebert Knoll Unit 7 | Braintree | Massachusetts | Norfolk County | 25021.0 | 2184 | 42.214009 | -71.004896 | 175817.63 | 55473.97 | 49737 | medium-income
8 | 3648fb36-1cd1-3641-0b1c-1f00d1e7e7de | 2006-07-02 | NaN | 999-78-1635 | S99943171 | Unknown | Ms. | Domenica436 | Unknown | Rau926 | Unknown | Unknown | Unknown | white | hispanic | F | Maynard Massachusetts US | 963 Senger Fort | Haverhill | Massachusetts | Essex County | 25009.0 | 1835 | 42.816043 | -71.051503 | 52933.16 | 11941.44 | 77756 | medium-income
9 | 50ca7edb-0dee-35e6-5d8f-66fbcb0b37c1 | 1948-05-28 | NaN | 999-27-5104 | S99941458 | X59458953X | Mr. | Arnulfo253 | Jordan900 | Jaskolski867 | Unknown | Unknown | D | white | nonhispanic | M | Boston Massachusetts US | 757 Lockman Annex Apt 10 | Georgetown | Massachusetts | Essex County | Unknown | 0 | 42.695602 | -70.972510 | 242013.44 | 322768.97 | 35255 | low-income
index | id | birthdate | deathdate | ssn | drivers | passport | prefix | firstname | middlename | lastname | suffix | maiden | marital | race | ethnicity | gender | birthplace | address | city | state | county | fips | zip | lat | lon | healthcare_expenses | healthcare_coverage | income | income_category
96 | 98cbb02b-c16a-60e4-1ff0-37c0e45e0e9f | 2011-09-30 | NaN | 999-74-6516 | Unknown | Unknown | Unknown | Moses679 | Unknown | Friesen796 | Unknown | Unknown | Unknown | white | nonhispanic | M | Boston Massachusetts US | 689 Bailey Plaza Apt 88 | Brockton | Massachusetts | Plymouth County | 25023.0 | 2351 | 42.046132 | -71.001359 | 3895.86 | 285495.20 | 8615 | low-income
97 | d6cc7569-5f31-9648-ec6a-e1162b32b183 | 2008-06-07 | NaN | 999-59-4941 | S99999666 | Unknown | Unknown | Diamond340 | Mirtha993 | Keebler762 | Unknown | Unknown | Unknown | white | nonhispanic | F | North Attleborough Massachusetts US | 905 Smitham Bay | Braintree | Massachusetts | Norfolk County | 25021.0 | 2184 | 42.237635 | -71.018562 | 56729.23 | 13905.95 | 94205 | high-income
98 | 780fe740-20fb-07ee-1fbd-3fafa9f5df91 | 2009-08-20 | NaN | 999-71-1449 | Unknown | Unknown | Unknown | Stanton715 | Dion244 | Kassulke119 | Unknown | Unknown | Unknown | white | nonhispanic | M | Taunton Massachusetts US | 539 Grady Fork Suite 43 | Leominster | Massachusetts | Worcester County | 25027.0 | 1453 | 42.583332 | -71.817754 | 3969.77 | 55724.98 | 24218 | low-income
99 | cca2c7f0-a2aa-94e5-ccea-cb78a7d38652 | 1972-01-25 | NaN | 999-36-7955 | S99988067 | X10446987X | Mrs. | Margarette462 | Britt177 | West559 | Unknown | Heidenreich818 | D | white | nonhispanic | F | Chelmsford Massachusetts US | 756 Schaefer Row Apt 84 | Yarmouth | Massachusetts | Barnstable County | Unknown | 0 | 41.666870 | -70.218846 | 119874.42 | 921478.76 | 124775 | high-income
100 | 3c7e37b0-c610-bc9a-d75a-f782e5dc7598 | 2023-01-18 | NaN | 999-74-5035 | Unknown | Unknown | Unknown | Lael572 | Anitra287 | Schuppe920 | Unknown | Unknown | Unknown | white | nonhispanic | F | Boston Massachusetts US | 752 Simonis Gate Suite 16 | Holyoke | Massachusetts | Hampden County | 25013.0 | 1040 | 42.238973 | -72.613961 | 4871.79 | 0.00 | 545255 | high-income
101 | 37713015-cfb5-bf1a-70eb-970101f32341 | 2018-04-09 | NaN | 999-80-8977 | Unknown | Unknown | Unknown | Yun266 | Norah104 | Ernser583 | Unknown | Unknown | Unknown | white | nonhispanic | F | Holliston Massachusetts US | 376 Ullrich Knoll Unit 86 | Fairhaven | Massachusetts | Bristol County | Unknown | 0 | 41.668507 | -70.897356 | 15979.43 | 4193.28 | 35486 | low-income
102 | d426334c-a982-3a31-7e0f-ca3c7fe01310 | 1960-05-07 | NaN | 999-80-9251 | S99966941 | X9157439X | Mrs. | Anita473 | Berta524 | Sánchez310 | Unknown | Rodarte647 | W | white | hispanic | F | Santiago de los Caballeros Santiago DO | 977 White Row | Beverly | Massachusetts | Essex County | 25009.0 | 1915 | 42.520925 | -70.873600 | 955755.57 | 1280069.64 | 61016 | medium-income
103 | cb1b46a1-9cb5-1187-ccc5-9fb7b98aa957 | 1982-12-09 | NaN | 999-83-1974 | S99951357 | X10229924X | Mr. | Grady603 | Delmar187 | Swaniawski813 | Unknown | Unknown | M | white | nonhispanic | M | Springfield Massachusetts US | 623 Crooks Street | Sharon | Massachusetts | Norfolk County | 25021.0 | 2067 | 42.143164 | -71.170529 | 302685.62 | 87202.01 | 63727 | medium-income
104 | d1622e8b-d26b-ec81-ffcb-ec4bf2af385b | 1951-11-22 | 2017-08-18 | 999-55-3884 | S99996090 | X31384759X | Mrs. | Elna874 | Dian810 | Prohaska837 | Unknown | Bogisich202 | D | asian | nonhispanic | F | Fitchburg Massachusetts US | 574 Stanton Stravenue | Boston | Massachusetts | Suffolk County | 25025.0 | 2129 | 42.322290 | -71.025025 | 100734.69 | 1441488.68 | 92537 | high-income
105 | f339a5f7-0b09-3072-2b01-7c8e8ca2c1fc | 1951-11-22 | NaN | 999-66-2146 | S99975537 | X26025438X | Ms. | Blanca837 | Allyn942 | Reinger292 | Unknown | Unknown | S | asian | nonhispanic | F | Millis-Clicquot Massachusetts US | 698 Hagenes Annex | Boston | Massachusetts | Suffolk County | 25025.0 | 2116 | 42.421764 | -71.004807 | 655129.70 | 381212.81 | 92537 | high-income